CUDA Programming Guide: Fundamentals of CUDA Kernel Development

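CUDA kernel development begins with defining a kernel, a special C++ function designed to run in parallel on an NVIDIA GPU and its large number of cores. Kernels are the basic unit of work in the CUDA programming model, and they bridge the transition from serial host logic to massively parallel device execution.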

1. The __global__ specifier

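The __global__ declaration specifier is the required qualifier that tells the compiler to generate GPU code for a function while keeping its entry point callable from the CPU. Any function that runs on the GPU and can be called from the host is called a kernel.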
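The definition above can be sketched as a minimal program. This is an illustrative sketch rather than an official sample: the kernel name `fillIndices`, the array size, and the launch configuration of 2 blocks with 4 threads each are assumptions chosen for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: __global__ makes it run on the GPU while staying
// launchable from host code. Each thread writes its own global index.
__global__ void fillIndices(int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = i;
    }
}

int main() {
    const int n = 8;
    int h_out[n] = {0};
    int *d_out = nullptr;

    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // Host-side launch: 2 blocks of 4 threads (assumed configuration).
    fillIndices<<<2, 4>>>(d_out, n);

    // Copy the results back; this copy waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < n; ++i) {
        printf("%d ", h_out[i]);
    }
    printf("\n");
    return 0;
}
```

The bounds check `i < n` guards against launches where the total thread count exceeds the array length.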

2. Execution environment

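Kernels are dispatched to and executed on streaming multiprocessors (SMs). The SM is the main compute engine inside an NVIDIA GPU and manages hundreds of concurrently executing threads. Each SM processes thread blocks and schedules their threads onto its processing cores.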
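To see how these SM limits look on a particular machine, the runtime API can report them. The sketch below assumes device index 0 is the GPU of interest; the printed values depend entirely on the installed hardware.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    // Device 0 is an assumption; multi-GPU systems may want another index.
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("Device: %s\n", prop.name);
    printf("Streaming multiprocessors: %d\n", prop.multiProcessorCount);
    printf("Max resident threads per SM: %d\n",
           prop.maxThreadsPerMultiProcessor);
    printf("Max threads per block: %d\n", prop.maxThreadsPerBlock);
    return 0;
}
```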

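Syntax rule: a kernel must return void. Because it runs asynchronously with respect to the host, it cannot return a value directly to the CPU. Results must instead be written back to allocated device memory.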
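The rule above can be illustrated with a kernel that returns void and delivers its result through a device pointer. This is a sketch under stated assumptions: `addKernel`, the input values, and the single-block launch are hypothetical choices, not part of the original lesson.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: returns void and writes a[i] + b[i] to device memory.
__global__ void addKernel(const float *a, const float *b, float *result, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        result[i] = a[i] + b[i];
    }
}

int main() {
    const int n = 4;
    const size_t bytes = n * sizeof(float);
    float h_a[n] = {1.0f, 2.0f, 3.0f, 4.0f};
    float h_b[n] = {10.0f, 20.0f, 30.0f, 40.0f};
    float h_r[n] = {0.0f};

    float *d_a = nullptr, *d_b = nullptr, *d_r = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_r, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // The launch returns control to the host right away.
    addKernel<<<1, n>>>(d_a, d_b, d_r, n);

    // Check for launch errors, then wait until the kernel has finished.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    if (err == cudaSuccess) {
        // The result lives in device memory; copy it back to read it.
        err = cudaMemcpy(h_r, d_r, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_r);

    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < n; ++i) {
        printf("%.1f ", h_r[i]);
    }
    printf("\n");
    return 0;
}
```

The explicit `cudaDeviceSynchronize()` call is included to make the asynchronous behavior visible; the blocking `cudaMemcpy` on the default stream would also wait for the kernel before copying.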

QUESTION 1

What is the primary function of the __global__ specifier?

It defines a function that runs on the CPU but is callable from the GPU.

It defines a kernel that runs on the GPU and is callable from the CPU.

It allocates memory on the GPU's SM cache.

It synchronizes all threads in a block.

Explanation: __global__ marks entry-point kernels that run on the GPU and are launched from host code.

QUESTION 2

Why must CUDA kernels return void?

Because they execute asynchronously and have no direct path to return values to the Host thread.

To save registers on the SM.

Because GPU memory is read-only.

The NVCC compiler does not support float returns.

QUESTION 3

Which hardware component is responsible for managing and executing threads in a CUDA kernel?

The PCIe Controller.

The Streaming Multiprocessor (SM).

The Host RAM controller.

The BIOS.

QUESTION 4

What happens when a Host calls a kernel function?

The CPU halts until the GPU finishes processing.

The GPU creates a clone of the function for every available SM.

The kernel is enqueued for execution on the GPU, and the CPU continues to the next instruction.

The CPU performs a context switch to the GPU.

QUESTION 5

Which of the following is the correct definition of a CUDA kernel?

A function that executes on the GPU and is invoked from the Host.

A C++ library for file I/O.

A hardware driver for NVIDIA GPUs.

A standard CPU function with the __gpu__ prefix.
